@fmount fmount commented Oct 17, 2025

Before this patch, the "updates topology" envTest ran all of its assertions inside a single Eventually block. Each assertion depends on the reconcile loop moving forward and on certain conditions being satisfied, so each one needs to be checked on its own.
This patch attempts to split the Eventually block so that:

  • Each assertion gets its own retry logic (based on timeout, interval)
  • Early assertion failures do not mask later issues

This approach has already been applied to other operators (e.g. Nova), where many controllers try to apply/remove finalizers to/from the same resource.
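
The split described above can be sketched without the test framework. This is a minimal, stdlib-only illustration and not the code from this patch: `eventually` is a hypothetical stand-in for Gomega's Eventually, `failsUntil` simulates a condition that the reconcile loop satisfies only after a few polls, and the `timeout` and `interval` values are assumptions chosen to keep the demo fast.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Values are assumptions picked to keep the demo fast; the real suite
// defines its own timeout and interval.
const (
	timeout  = 200 * time.Millisecond
	interval = 5 * time.Millisecond
)

// eventually is a hypothetical stand-in for Gomega's Eventually: it polls
// check every `every` until it returns nil or `within` has elapsed.
func eventually(within, every time.Duration, check func() error) error {
	deadline := time.Now().Add(within)
	for {
		err := check()
		if err == nil {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("timed out: %w", err)
		}
		time.Sleep(every)
	}
}

// failsUntil simulates a condition that the reconcile loop satisfies only
// after n polls (for example, a finalizer that is removed late).
func failsUntil(n int, msg string) func() error {
	calls := 0
	return func() error {
		calls++
		if calls <= n {
			return errors.New(msg)
		}
		return nil
	}
}

func main() {
	// Each step gets its own retry budget instead of sharing one block.
	steps := []struct {
		name  string
		check func() error
	}{
		{"new topology referenced", failsUntil(3, "topologyRef not updated yet")},
		{"old topology finalizers removed", failsUntil(5, "finalizers still present")},
	}
	for _, s := range steps {
		if err := eventually(timeout, interval, s.check); err != nil {
			fmt.Printf("%s: %v\n", s.name, err)
			return
		}
		fmt.Printf("%s: ok\n", s.name)
	}
}
```

With one retry loop per step, a timeout is reported against the step that did not converge, and a slow earlier step does not consume the retry budget of the steps that follow it.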

@fmount fmount requested a review from abays October 17, 2025 10:00
@openshift-ci openshift-ci bot requested a review from olliewalsh October 17, 2025 10:00
openshift-ci bot commented Oct 17, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: fmount
Once this PR has been reviewed and has the lgtm label, please assign viroel for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@fmount fmount changed the title from "Fix envTest topologyRef race condition" to "Wip - Fix envTest topologyRef race condition" Oct 17, 2025
fmount commented Oct 17, 2025

/test functional

fmount commented Oct 17, 2025

/test functional

fmount commented Oct 17, 2025

/test functional

fmount commented Oct 19, 2025

/test functional

fmount commented Oct 19, 2025

/test functional

fmount commented Oct 19, 2025

/test functional

fmount commented Oct 19, 2025

/test functional

fmount commented Oct 19, 2025

/test functional

fmount commented Oct 19, 2025

/test functional

fmount commented Oct 19, 2025

/test functional

fmount commented Oct 20, 2025

/test functional

fmount commented Oct 20, 2025

The ci-scheduling-dns-wait container could not start because it could not pull "registry.access.redhat.com/ubi8". Check your images. Full message: "Back-off pulling image \"registry.access.redhat.com/ubi8\": ErrImagePull: unable to pull image or OCI artifact: pull image err: initializing source docker://registry.access.redhat.com/ubi8:latest: reading manifest latest in registry.access.redhat.com/ubi8: received unexpected HTTP status: 502 Bad Gateway; artifact err: get manifest: build image source: reading manifest latest in registry.access.redhat.com/ubi8: received unexpected HTTP status: 502 Bad Gateway"

The last failure is unrelated: Prow wasn't able to start the job.

fmount commented Oct 20, 2025

/test functional

fmount commented Oct 23, 2025

/test functional

fmount commented Oct 23, 2025

/test functional

fmount commented Oct 23, 2025

/test functional

@fmount fmount force-pushed the toporef_envtest branch 2 times, most recently from 200f72e to 0de0624 October 23, 2025 09:36
fmount commented Oct 23, 2025

/test functional

fmount commented Oct 23, 2025

/test functional

fmount commented Oct 23, 2025

/test functional

@fmount fmount changed the title from "Wip - Fix envTest topologyRef race condition" to "Fix envTest topologyRef race condition" Oct 23, 2025
fmount commented Oct 23, 2025

/test functional

fmount commented Oct 23, 2025

/test functional

  finalizers := prevTopology.GetFinalizers()
  g.Expect(finalizers).To(BeEmpty())
- }, timeout, interval).Should(Succeed())
+ }, 20*timeout, interval).Should(Succeed())
Contributor

Do we need the 20x timeout here given that we think we know what is going on? I just wonder if you had added this strictly for testing purposes.

Contributor Author

Hey no, I think I simply forgot that value here! Good catch, updating the patch right away!

Contributor Author

(Done)

Contributor Author

This is weird: locally it works after removing the 20*timeout, but in CI it fails.

Before this patch, the "updates topology" envTest ran all of its
assertions inside a single Eventually block. Each assertion depends on
the reconcile loop moving forward and on certain conditions being
satisfied, so each one needs to be checked on its own.
This patch attempts to split the Eventually block so that:

- Each assertion gets its own retry logic (based on timeout, interval)
- Early assertion failures do not mask later issues

This approach has been applied already to other operators (e.g. Nova)
where there are many controllers trying to apply/remove finalizers to
the same resource.

Signed-off-by: Francesco Pantano <[email protected]>
openshift-ci bot commented Oct 24, 2025

@fmount: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name            Commit    Details   Required   Rerun command
ci/prow/functional   c7eb9d4   link      true       /test functional

